Alleviating Linear Ecological Bias and Optimal Design with Sub-sample Data.

Authors

  • Adam Glynn
  • Jon Wakefield
  • Mark S Handcock
  • Thomas S Richardson
Abstract

In this paper, we illustrate that combining ecological data with subsample data in situations in which a linear model is appropriate provides three main benefits. First, by including the individual level subsample data, the biases associated with linear ecological inference can be eliminated. Second, by supplementing the subsample data with ecological data, the information about parameters will be increased. Third, we can use readily available ecological data to design optimal subsampling schemes, so as to further increase the information about parameters. We present an application of this methodology to the classic problem of estimating the effect of a college degree on wages. We show that combining ecological data with subsample data provides precise estimates of this value, and that optimal subsampling schemes (conditional on the ecological data) can provide good precision with only a fraction of the observations.
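The first benefit can be illustrated with a small simulation. The sketch below is not the authors' estimator: it assumes a hypothetical data-generating process in which each area's baseline outcome rises with the area's share of individuals with `x = 1` (a contextual effect of size `gamma`), so that regressing area means on area means recovers roughly `beta + gamma` rather than `beta`. A simple within-area regression on a small individual-level subsample stands in for the combined analysis. All function names, parameter values, and the choice of estimator are assumptions made for illustration.

```python
import numpy as np


def simulate(n_areas=200, n_per_area=500, beta=2.0, gamma=3.0, seed=0):
    """Hypothetical population: y = alpha_area + beta * x + noise."""
    rng = np.random.default_rng(seed)
    # Share of individuals with x = 1 in each area.
    p = rng.uniform(0.1, 0.6, size=n_areas)
    x = (rng.random((n_areas, n_per_area)) < p[:, None]).astype(float)
    # Contextual effect (assumed): the area baseline rises with p.
    alpha = 1.0 + gamma * p
    y = alpha[:, None] + beta * x + rng.normal(0.0, 1.0, size=x.shape)
    return x, y


def ecological_slope(x, y):
    """Slope from regressing area means of y on area means of x."""
    xbar, ybar = x.mean(axis=1), y.mean(axis=1)
    return np.polyfit(xbar, ybar, 1)[0]


def subsample_within_slope(x, y, k=20, seed=1):
    """Within-area slope from k individuals sampled per area."""
    rng = np.random.default_rng(seed)
    # Random permutation per area; the first k columns give a sample
    # without replacement.
    idx = np.argsort(rng.random(x.shape), axis=1)[:, :k]
    xs = np.take_along_axis(x, idx, axis=1)
    ys = np.take_along_axis(y, idx, axis=1)
    # Centering within each area removes the area-level baseline.
    xc = xs - xs.mean(axis=1, keepdims=True)
    yc = ys - ys.mean(axis=1, keepdims=True)
    return (xc * yc).sum() / (xc ** 2).sum()


if __name__ == "__main__":
    x, y = simulate()
    print("true individual-level effect: 2.0")
    print("ecological slope:", round(ecological_slope(x, y), 2))
    print("subsample within-area slope:", round(subsample_within_slope(x, y), 2))
```

Under these assumptions the ecological slope lands near `beta + gamma`, while the subsample estimate, which uses only 20 of the 500 individuals per area, lands near `beta`. The sketch does not address the second and third benefits (efficiency gains from the ecological data and optimal subsample design).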

Similar resources

Alleviating Ecological Bias in Voter Turnout Models (and other Generalized Linear Models) with Optimal Subsample Design

In this paper, we illustrate that combining ecological data with subsample data in situations in which a generalized linear model (GLM) is appropriate provides two main benefits. First, by including the individual level subsample data, the biases associated with ecological inference in GLMs can be eliminated. Second, available ecological data can be used to design optimal subsampling schemes, s...


Alleviating Ecological Bias in Poisson Models using Optimal Subsampling: The Effects of Jim Crow on Black Illiteracy in the Robinson Data

In many situations data are available at the group level but one wishes to estimate the individual-level association between a response and an explanatory variable. Unfortunately this endeavor is fraught with difficulties because of the ecological level of the data. The only reliable solution to such ecological inference problems is to supplement the ecological data with individual-level data. ...


THE COMPARISON OF TWO METHOD NONPARAMETRIC APPROACH ON SMALL AREA ESTIMATION (CASE: APPROACH WITH KERNEL METHODS AND LOCAL POLYNOMIAL REGRESSION)

Small Area estimation is a technique used to estimate parameters of subpopulations with small sample sizes. Small area estimation is needed in obtaining information on a small area, such as sub-district or village. Generally, in some cases, small area estimation uses parametric modeling. But in fact, a lot of models have no linear relationship between the small area average and the covariat...


Strategies for monitoring and evaluation of resource-limited national antiretroviral therapy programs: the two-phase design

BACKGROUND In resource-limited settings, monitoring and evaluation (M&E) of antiretroviral treatment (ART) programs often relies on aggregated facility-level data. Such data are limited, however, because of the potential for ecological bias, although collecting detailed patient-level data is often prohibitively expensive. To resolve this dilemma, we propose the use of the two-phase design. Spec...


Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples

In ecology, as in other research fields, efficient sampling for population estimation often drives sample designs toward unequal probability sampling, such as in stratified sampling. Design based statistical analysis tools are appropriate for seamless integration of sample design into the statistical analysis. However, it is also common and necessary, after a sampling design has been implemente...


Journal:
  • Journal of the Royal Statistical Society. Series A

Volume 171, Issue 1

Pages: -

Publication date: 2008